

Lip-Reading AI is Under Development, Under Watchful Eyes - AI Trends

#artificialintelligence

A lip-reading app from Irish startup Liopa is said to represent a breakthrough in the field of visual speech recognition (VSR), which trains AI to read lips without any audio input. Liopa's product, SRAVI (Speech Recognition App for the Voice Impaired), is a communication aid for speech-impaired patients. It is likely to be the first lip-reading AI app available for public purchase, according to an account from Vice/Motherboard. Researchers, motivated by a range of potential commercial applications including surveillance tools, have worked for years to teach computers to lip-read, a task that has proven challenging. Liopa is working to certify SRAVI as a Class I medical device in Europe, hoping to complete the certification by August.


Lip-Reading AI Could Help the Deaf--or Spies

#artificialintelligence

U.K.-based DeepMind has created an artificial intelligence (AI) program that can read lips better than professional lip readers, after reviewing thousands of hours of YouTube videos along with transcripts via machine learning. The researchers tested the program on 37 minutes of video it had not previously viewed, and it misidentified only 41% of the words. In comparison, the best previous computer method, which focuses on individual letters instead of phonemes, had a 77% word error rate, while professional lip readers had a 93% error rate in the same test, which lacked context or body language. Columbia University's Hassan Akbari says the AI, if incorporated into a phone, would enable hearing-impaired users to have a "translator" with them wherever they go.
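The error rates cited above are word error rates (WER), the standard metric for speech and lip-reading systems: the word-level edit distance between the system's transcript and the reference, divided by the reference length. A minimal sketch (not from any of the systems described; the example sentences are invented for illustration):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table for edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One missed word out of six reference words -> WER of 1/6, about 0.167.
print(wer("the cat sat on the mat", "the cat sat on mat"))
```

Under this metric, the 41% figure means roughly four in ten words were wrong, which is why a 93% professional error rate on the same audio-free, context-free clips makes the comparison striking.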


Oxford University's lip-reading AI is more accurate than humans, but still has a way to go

#artificialintelligence

Even professional lip-readers can figure out only 20% to 60% of what a person is saying. Slight movements of a person's lips at the speed of natural speech are immensely difficult to reliably understand, especially from a distance or if the lips are obscured. And lip-reading isn't just a plot point in NCIS: it's an essential tool for understanding the world for the hearing-impaired and, if automated reliably, could help millions of people. A new paper (pdf) from the University of Oxford (with funding from Alphabet's DeepMind) details an artificial intelligence system, called LipNet, that watches video of a person speaking and matches text to the movement of their mouth with 93.4% accuracy. The previous state-of-the-art system operated word-by-word and had an accuracy of 79.6%.